SciSumm: A Multi-Document Summarization System for Scientific Articles

نویسندگان

  • Nitin Agarwal
  • Ravi Shankar Reddy
  • Kiran G. V. R.
  • Carolyn Penstein Rosé
چکیده

In this demo, we present SciSumm, an interactive multi-document summarization system for scientific articles. The document collection to be summarized is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic based clustering of fragments extracted from each article based on queries generated from the context surrounding the co-cited list of papers. This analysis enables the generation of an overview of common themes from the co-cited papers that relate to the context in which the co-citation was found. SciSumm is currently built over the 2008 ACL Anthology, however the generalizable nature of the summarization techniques and the extensible architecture makes it possible to use the system with other corpora where a citation network is available. Evaluation results on the same corpus demonstrate that our system performs better than an existing widely used multi-document summarization system (MEAD).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Towards Multi-Document Summarization of Scientific Articles:Making Interesting Comparisons with SciSumm

We present a novel unsupervised approach to the problem of multi-document summarization of scientific articles, in which the document collection is a list of papers cited together within the same source article, otherwise known as a co-citation. At the heart of the approach is a topic based clustering of fragments extracted from each co-cited article and relevance ranking using a query generate...

متن کامل

LaSTUS/TALN @ CLSciSumm-17: Cross-document Sentence Matching and Scientific Text Summarization Systems

In recent years there has been an increasing interest in approaches to scientific summarization that take advantage of the citations a research paper has received in order to extract its main contributions. In this context, the CL-SciSumm 2017 Shared Task has been proposed to address citation-based information extraction and summarization. In this paper we present several systems to address thr...

متن کامل

Overview of the CL-SciSumm 2016 Shared Task

The CL-SciSumm 2016 Shared Task is the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. The task built off of the experience and training data set created in its namesake pilot task, which was conducted in 2014 by the same organizing committee. The track included three tasks involving: (1A) identifying relationships between citing...

متن کامل

University of Mannheim @ CLSciSumm-17: Citation-Based Summarization of Scientific Articles Using Semantic Textual Similarity

The number of publications is rapidly growing and it is essential to enable fast access and analysis of relevant articles. In this paper, we describe a set of methods based on measuring semantic textual similarity, which we use to semantically analyze and summarize publications through other publications that cite them. We report the performance of our approach in the context of the third CL-Sc...

متن کامل

The CL-SciSumm Shared Task 2017: Results and Key Insights

The CL-SciSumm Shared Task is the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. In 2017, it comprised three tasks: (1A) identifying relationships between citing documents and the referred document, (1B) classifying the discourse facets, and (2) generating the abstractive summary. The dataset comprised 40 annotated sets of citin...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011